SQL Server Integration Services (SSIS)
SQL Server Integration Services is an ETL tool. Using SSIS, we can create a data transformation service: extract data from various operational sources (Excel, flat files, SQL Server, Oracle, etc.), transform the source business data using the existing transformations in the staging area, and load it into the destination database or file system.
SSIS architecture is categorized into two components.
The SSIS runtime engine completely handles the control flow of the package.
Control flow: The control flow of a package defines the actions that are to be executed when the package runs. The control flow contains various tasks and containers as well.
Task: A unit of work in a workflow.
For example, Data flow task, execute SQL task, etc.
Container: A container is used to divide the package into multiple blocks.
For example: For Loop container, For Each Loop container, Sequence container, and Task Host container.
The data flow (transformation pipeline) engine completely handles the data flow of the package. The data flow contains data flow sources (Excel source, flat file source, OLEDB source, etc.), data flow transformations (Conditional Split transformation, Derived Column transformation, Lookup transformation, etc.), and data flow destinations.
Note: Whenever a data flow task occurs in the control flow, the SSIS runtime engine passes control from the control flow to the data flow to run the ETL process while the package is running.
Connection Manager: A logical connection between SSIS application and database or file system.
Note: Connection Manager can be established by using various providers in SSIS.
Package: The package is the core component of SSIS. A package can be created through a simple graphical user interface or programmatically.
Data conversion transformation:
Data conversion transformation is used to convert the data from one data type to another data type and also adds a new column to the data set.
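The idea can be sketched in plain Python (this is not SSIS code; `data_conversion` is a hypothetical helper used only for illustration): the transformation casts a source column to a new type and adds the converted value as a new aliased column, leaving the original column intact.

```python
# Plain-Python sketch of the data conversion transformation concept.
# data_conversion is a hypothetical helper, NOT an SSIS API.
def data_conversion(rows, conversions):
    """conversions maps source column -> (output alias, cast function)."""
    for row in rows:
        for src, (alias, cast) in conversions.items():
            row[alias] = cast(row[src])  # new column; original is preserved
    return rows

rows = [{"Title": "Mr", "MaritalStatus": "S"}]
out = data_conversion(rows, {"Title": ("Title DC", str),
                             "MaritalStatus": ("Marital Status DC", str)})
print(sorted(out[0].keys()))
```

Note how the original columns survive alongside the new aliased ones, which is why the later steps remove the duplicates at the destination.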
Steps to configure data conversion transformation
Start----->
All Programmes----->
Select Microsoft SQL Server 2005----->
Select Business Intelligence Development Studio (to implement all the BI activities)----->
Select file menu----->
Select New----->
Select Project----->
Select the business intelligence projects option under the project types section----->
Select integration services project under the templates section----->
Change the project name as (evening 7:30 Batch)----->
Change the project location----->
Click ok----->
Select package .dtsx in solution explorer and rename it as data conversion .dtsx----->
In control flow, drag and drop the data flow task and rename it as Data Flow Task Data Conversion----->
Select the data flow task and right-click and select the edit option from the right-click popup menu.----->
In the data flow tab drag and drop OLEDB Source----->
Double click on OLEDB Source to edit it.----->
In the connection manager page, click new to create a new connection manager.----->
Click new-----> Provide server name (localhost (or) server name)----->
Select the database name (AdventureWorks) from the drop-down list----->
Click test connection to evaluate the connection----->
Click ok----->
Click ok----->
Select table or view option from data access mode drop-down list----->
Select the HumanResources.Employee table name from the drop-down list----->
Select columns from the left panel ------>
Click ok ------>
Drag and drop data conversion transformation and make a connection from OLEDB source to data conversion transformation
Double click on data conversion transformation ------>
Check Title and Marital Status from the available input columns, change the data type to string (DT_STR), and rename the output alias columns as Title DC and Marital Status DC. Click ok to save the changes. ------>
Drag and drop OLEDB destination from the data flow destinations section. ------>
Make a connection from data conversion transformation to OLEDB destination. ------>
Double click On OLEDB destination ------>
In the connection manager page, click new to create a new connection manager. ------>
New ------>
Provide the destination server name (localhost) and select the AdventureWorks database from the drop-down list ------> Click test connection ------>
Click ok ------>
Click ok ------>
Click new to create the destination table, remove the duplicate Title and Marital Status columns, and rename the table as Converted Data. ------>
Click ok ------>
Select the mappings option from the left panel to make a mapping between the input (source) columns and the destination columns
In Business Intelligence Development Studio, press Ctrl+Alt+L to open solution explorer and select the data conversion .dtsx package
Right-click and select the execute package option
Derived column transformation:
The derived column transformation enables in-line transformations using SSIS expressions to transform the data. The typical use of the derived column transformation is to create or derive new columns by using the existing source columns, variables, or available functions.
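A plain-Python sketch of the concept (not SSIS code; `add_derived_column` is a hypothetical helper): an expression is evaluated once per row and its result stored under a new column name, much like deriving Ref Date from GETDATE() in the SSIS expression language.

```python
from datetime import date

# Plain-Python sketch of a derived column, NOT an SSIS API.
def add_derived_column(rows, name, expression):
    for row in rows:
        row[name] = expression(row)  # expression may also use other columns
    return rows

rows = [{"EmployeeID": 1}, {"EmployeeID": 2}]
add_derived_column(rows, "RefDate", lambda row: date.today())
print(rows[0]["RefDate"] == rows[1]["RefDate"])  # every row gets the derived value
```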
Start ------>
All programs ------>
Microsoft SQL Server 2005 ------>
Select the Microsoft business intelligence development studio. ------>
Select file menu ------>
Select new ------>
Projects ------>
Select business Intelligence Projects option ------>
Integration Services project ------>
Change project location and name ------>
Click ok ------>
In the Business Intelligence Development Studio, in the control flow, drag and drop the data flow task and rename it as Data Flow Task Derived Column. ------>
Double click on the data flow task to edit it ------>
In the data flow, drag and drop OLEDB source ------>
Double click on OLEDB source to configure it ------>
Click new to create the source connection manager. ------>
Click new ------>
In the connection manager editor, provide the server name (localhost (or) server name) ------>
Select AdventureWorks from the drop-down list ------>
Click ok twice ------>
Select the HumanResources.EmployeeAddress table from the drop-down list ------>
Select columns from left panel ------>
Click ok ------>
Open SSMS (SQL Server Management Studio) and run the following query to create destination table. ------>
Create table [Derived Column](
[Employee ID] Integer,
[Address ID] Integer,
[Row guid] UniqueIdentifier,
[Modified Date] DateTime,
[Ref Date] DateTime)
Go to Business Intelligence development studio and drag and drop derived column transformation and make a connection from OLEDB source to derived column using a green data flow path. ------>
Double click on derived column ------>
Define the following expression
Derived column name | Expression | Data type
Ref Date | (DT_DBDATE) GETDATE() | DT_DBDATE
Note: The above expression is defined by dragging and dropping the GETDATE() function from the date/time functions section and renaming Derived Column 1 as Ref Date; the same will be carried forward to the destination in our scenario.
Execute SQL task:
The execute SQL task is used to execute relational queries such as DDL and DML statements against a connection.
In the execute SQL task, the connection is nothing but a connection manager. Follow the steps below to create a connection manager in the execute SQL task.
Open Business Intelligence Development studio ------>
Create a new package and rename it as execute SQL .dtsx ------>
In control flow drag and drop execute SQL task on to design area ------>
Double click on execute SQL task to edit it ------>
Provide the following steps ------>
Select new connection ------>
Click new ------>
Provide server name (localhost (or) server name) ------>
Select the AdventureWorks database from the drop-down list. ------>
Click test connection to evaluate the connection between the database and SSIS ------>
Click ok ------>
Click ok ------>
SQL source type – Select Direct Input (default)
Note: here, we have 3 options: Direct Input, File Connection, and Variable
SQL statement – Truncate table [any valid table name]
Click ok to save the changes ------>
In solution explorer (Ctrl+Alt+L), select execute SQL .dtsx ------>
Right-click and select execute package option (2nd option)
Execute package task:
The execute package task is used to execute a package within the parent package.
Open Business Intelligence Development Studio, and in solution explorer create a new package and rename it as exec pkg .dtsx. ------>
In control flow, drag and drop execute package task ------>
Rename the execute package task as EPT calling exec SQL package ------>
Double click on execute package task to configure it ------>
Select package option from left pane and set the following properties ------>
Location – select file system ------>
Connection – select new connection ------>
Click browse ------>
Navigate to the path where the package is stored in, ------>
Select execute SQL dtsx ------>
Click open ------>
Click ok ------>
In solution explorer, select execute package dtsx ------>
Right click and select execute package option. ------>
The linked package will be automatically opened by the current package and then executed.
In SSIS, the variables are categorized into two types
System variables are built-in variables, and system variables can be accessed throughout the package
For example: System::CreationName
System::PackageName
System::TaskID etc.,
Note: System-defined variables can be identified by the scope resolution operator (::). That means all system variables should start with System::
User-defined variables can be created by the developer, and a user-defined variable can have its own name, data type, value, and scope as well
Note: User-defined variables can be identified by User::variable name
Example: Package for excel source
Open Business intelligence development studio ------>
Create new package and rename it as excel source .dtsx ------>
In control flow drag and drop data flow task ------>
Double click on the data flow task to configure it ------>
In the Data flow, drag and drop excel source ------>
Double click on excel source to configure it ------>
Click new to create a connection manager for excel ------>
Click browse and select excel alliance details excel file and click open ------>
Ensure that the first row has column names checkbox is checked ------>
Click ok ------>
Select data access mode as table or view ------>
Select sheet 1 from drop-down list ------>
Select columns ------>
Click ok
Note: Prepare the following excel file that is already linked or connected to the excel source
Source code | SBAT Type | Partner type | Funded Amount |
81818 | MSA | Builder | 50000 |
81540 | B1 | Realtor | 40000 |
12345 | MAP | Realtor | 9000 |
Merge transformation:
The merge transformation combines two sorted data sets into a single output based on the values in their key columns. This transformation requires that the inputs (sources) are sorted, and the merged columns must have the same data type.
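The behavior can be sketched in plain Python (not SSIS code) with the standard library's `heapq.merge`, which likewise interleaves two already-sorted inputs into one sorted output, exactly like the merge step of merge sort:

```python
import heapq

# Plain-Python sketch of the merge transformation concept, NOT an SSIS API.
# Both inputs must already be sorted on the key column.
def merge_sources(source1, source2, key):
    return list(heapq.merge(source1, source2, key=lambda row: row[key]))

source1 = [{"EmployeeID": 1}, {"EmployeeID": 3}]
source2 = [{"EmployeeID": 2}, {"EmployeeID": 4}]
merged = merge_sources(source1, source2, "EmployeeID")
print([row["EmployeeID"] for row in merged])  # [1, 2, 3, 4]
```

If either input were unsorted, the output order would be wrong, which is why the steps below set IsSorted and a sort key position on each source first.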
Open business intelligence development studio ------>
In the integration services project, create a new package and rename it as merge .dtsx ------>
In control flow drag and drop data flow task and rename it as data flow task merge. ------>
In data flow drag and drop OLEDB source rename it as source 1 ------>
Double click on source 1 to configure it. ------>
Provide the connection manager if it exists ------>
Select the HumanResources.Employee table from the drop-down list ------>
Select columns from the left pane and click ok to save changes ------>
Right-click on source 1 and select show advanced editor and set the following properties, ------>
Select the input and output properties tab, ------>
Select OLEDB source output and set IsSorted – True, ------>
Expand OLEDB source output and also expand output columns. ------>
Select 1st column through which we are going to make a mapping between two sources (employee ID) and set, ------>
SortKeyPosition – 1 ------>
Click Refresh ------>
Click ok ------>
Drag and drop another OLEDB source and rename it as source 2 ------>
Double click on source 2 provide connection manager if exists ------>
Select HumanResources.EmployeeAddress from the drop-down list ------>
Select columns ------>
Click ok ------>
Right-click on source 2 select advanced editor to sort the data and set the following properties ------>
Select the input and output properties tab, ------>
Select OLEDB source output and set, ------>
IsSorted – True ------>
Expand OLEDB source output and also expand output columns, ------>
Select the 1st column through which we are going to make a mapping between two sources (employee ID) and set, ------>
SortKeyPosition – 1 ------>
Click refresh ------>
Click ok ------>
Drag and drop merge transformation ------>
Make a connection from source 1 to merge and select merge input 1 option in input, output selection editor ------>
Click ok ------>
Make a connection from source 2 to merge ------>
Double click on merge to make sure that all columns are mapped. ------>
Drag and drop OLEDB destination, make a connection from merge to OLEDB destination ------>
Double click on OLEDB destination provide destination connection manager and click new to create destination table and rename the table as merged data ------>
Click ok ------>
Select mappings ------>
Click ok ------>
In solution explorer select the package and select execute package ------>
Merge join transformation:
The merge join transformation combines two sorted data sets into a single output using an inner join (default join), left outer join, or full outer join.
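A left outer join can be sketched in plain Python (not SSIS code; this simplified version assumes unique join keys in the right input): every row of the left input is kept, and matching right-input columns are attached where the join key matches.

```python
# Plain-Python sketch of a LEFT OUTER merge join, NOT an SSIS API.
# Assumes unique join keys in the right input for simplicity.
def merge_join_left(left, right, key):
    right_by_key = {row[key]: row for row in right}
    joined = []
    for lrow in left:
        out = dict(lrow)
        match = right_by_key.get(lrow[key])
        if match is not None:
            out.update({k: v for k, v in match.items() if k != key})
        joined.append(out)  # left-only rows simply lack the right columns
    return joined

left = [{"EmployeeID": 1, "Title": "Mr"}, {"EmployeeID": 2, "Title": "Ms"}]
right = [{"EmployeeID": 1, "City": "Bothell"}]
result = merge_join_left(left, right, "EmployeeID")
print(result[0].get("City"), result[1].get("City"))  # Bothell None
```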
Configure 2 OLEDB sources (Source 1, source 2) from the previous example
Drag and drop merge join transformation ------>
Make a connection from source 1 to merge join and select merge join left input option from input-output selection editor ------>
Make a connection from source 2 to merge join transformation ------>
Double click on merge join and select left outer join as join type and select the following columns from both source 1 and source 2 ------>
Click ok ------>
Drag and drop OLEDB destination and make a connection from merge join to destination. ------>
Double click on OLEDB destination ------>
provide destination connection manager if exists and click new to create the destination table. ------>
Rename the OLEDB destination as merge join data. ------>
Click ok ------>
Execute package
Union All transformation: It combines multiple inputs into a single output. It differs from the merge and merge join transformations because union all doesn't require sorted inputs. However, the first input is the reference input that all subsequent inputs must match.
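Conceptually, union all just appends the rows of every input to one output, as this plain-Python sketch (not SSIS code) shows; there is no sorting and no duplicate removal, which is exactly how it differs from merge:

```python
# Plain-Python sketch of the union all transformation concept, NOT an SSIS API.
def union_all(*inputs):
    output = []
    for rows in inputs:
        output.extend(rows)  # each input must match the reference column layout
    return output

print(union_all([1, 2], [2, 3], [4]))  # [1, 2, 2, 3, 4] -- duplicates survive
```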
Conditional split transformation: It routes the input data to different outputs based on case conditions; if no case conditions are met, the data is routed to the default output. The implementation of a conditional split is similar to a case decision structure (switch case) in general programming languages.
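The routing can be sketched in plain Python (not SSIS code; `conditional_split` is a hypothetical helper): each row goes to the first output whose condition it satisfies, otherwise to the default output, like a switch/case over rows.

```python
# Plain-Python sketch of the conditional split concept, NOT an SSIS API.
def conditional_split(rows, cases, default="default output"):
    """cases is an ordered list of (output name, predicate) pairs."""
    outputs = {name: [] for name, _ in cases}
    outputs[default] = []
    for row in rows:
        for name, predicate in cases:
            if predicate(row):
                outputs[name].append(row)
                break
        else:  # no case condition met
            outputs[default].append(row)
    return outputs

rows = [{"src": 290, "dst": 290}, {"src": 290, "dst": 287}]
out = conditional_split(rows, [("counts match", lambda r: r["src"] == r["dst"])])
print(len(out["counts match"]), len(out["default output"]))  # 1 1
```

This mirrors the test package below, where rows with equal source and destination counts go to the success branch and everything else falls through to the default (failure) output.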
To test whether the package is executed successfully (or) failed, use the row count, conditional split, derived column, and union all transformations.
Open Business intelligence development studio. ------>
Create a new package and rename it as a test. dtsx ------>
In the control flow, drag, and drop the data flow task. ------>
In the data flow, drag and drop MISSING LINE PG:31 ------>
Double click on OLEDB source to configure it. ------>
Provide connection manager if exists ------>
Select the HumanResources.Employee table from the drop-down list. ------>
Select columns from left panel ------>
Click ok ------>
Drag and drop row count transformation and rename it as RC Src count ------>
Make a connection from OLEDB source to row count ------>
Double click on Row count to edit it. ------>
In component properties tab, provide the following property ------>
Custom properties ------>
Variable name – UVSrcCount ------>
Click refresh and click ok ------>
Drag and drop derived column transformation and make a connection from Row count to derived column ------>
Double click on derived column to define the execution date; provide the following expression: Execution Date – GETDATE() ------>
Click ok ------>
Drag and drop row count transformation and rename as RC Dest count ------>
Make a connection from the derived column to Row count. ------>
Double click on Row count and provide the following property ------>
Custom properties ------>
Variable name – UVDstCount ------>
Click refresh ------>
Click ok ------>
Drag and drop OLEDB destination and make a connection from row count to destination ------>
Double click on a destination to configure it. ------>
Provide destination connection manager if exists. ------>
Click new to create a destination table if it does not exist, and rename the destination table as Tested Data. ------>
Click ok ------>
Select mappings ------>
Click ok ------>
Note: Define the following variables at the package level in the control flow.
Name | Data Type | Value
UVSrcCount | Int32 |
UVDstCount | Int32 |
UVSolutionName | String | Morning 8:30 batch
UVTableName | String | Tested Data
In control flow, drag and drop data flow task ------>
Rename it as Data flow task test condition ------>
Double click on data flow task ------>
In the data flow, drag and drop OLEDB source ------>
Double click on OLEDB src to configure it ------>
Provide a source connection manager if it exists and set the following properties. ------>
Data access mode – select SQL command; in the SQL command text, provide the following query to fetch the execution date, ------>
Select distinct getdate() as [Execution Date] from [Tested Data] ------>
Drag and drop derived column transformation to derive the following columns using the existing variables. ------>
Make a connection from OLEDB src to the derived column. ------>
Double click on derived column ------>
Solution Name – @[User::UVSolutionName] ------>
Package Name – @[System::PackageName] ------>
Table Name – @[User::UVTableName] ------>
Source Count – @[User::UVSrcCount] ------>
Destination Count – @[User::UVDstCount] ------>
Click ok ------>
Drag and drop, conditional split transformation to check the condition ------>
Make a connection from derived column to conditional split. ------>
In the conditional split transformation editor, provide the following condition. ------>
Output Name | Condition
Case 1 | [Source Count] == [Destination Count]
Rename case 1 as Src count is equal to Dst count ------>
Rename the conditional split default output as Src count is not equal to Dst count ------>
Click ok ------>
Drag and drop derived column transformation and make a connection from conditional split to the derived column. ------>
Select src count is equal to dst count from input/output editor. ------>
Rename the derived column 1 as success status. ------>
Double click on the derived column and derive the following expression ------>
Derived column name Expression
Status “success”
Click ok ------>
Drag and drop derived column transformation and rename it as failure status. ------>
Make a connection from the conditional split to the derived column. ------>
Double click on failure status to define the status. ------>
Derived column name Expression
Status “Failure”
Click ok ------>
Drag and drop union all transformation and make a connection from success status to union all, and also make a connection from failure status to union all. ------>
Drag and drop OLEDB destination to capture the log information ------>
Make a connection from union all to destination ------>
Double click on a destination to configure it. ------>
Provide the destination connection manager if it exists ------>
Click new to create new destination table and rename it as SSIS_Log ------>
Click ok ------>
Select mappings ------>
Click ok ------>
Execute package ------>